A UN Datathon Story
EBS, Monash
EBS, Monash
Education, Melbourne
Maths, QUT
Maths & Stats, USyd
February 29, 2024
Aim: Slap together a half-baked solution in 3 days.
Problem: Globally, nearly a billion people lack reliable energy sources, and solar is a cost-effective way to meet this demand.
Solution: Map the areas of the globe where solar farm investment is likely to succeed, using existing solar farms as training data; overlay that onto a map of energy demand, proxied by night light data.
| Quantity | Source | Provided/Extracted Format |
|---|---|---|
| Population density | Google Earth Engine, provided by Oak Ridge National Laboratory | tiff |
| Night light intensity | NASA, Earth at Night project | tiff |
| Biomass/land use | NASA | tiff |
| Terrain slope | Google Earth Engine, provided by USGS | tiff |
| Photovoltaic potential | Global Solar Atlas | tiff |
| Solar farm locations | S. Dunnett, hosted on awesome-gee-community-catalog and figshare | csv |
All data were remapped from their raw forms onto a consistent global grid.
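
    library(raster); library(terra); library(dplyr)

    # Template grid: 3600 x 1800 cells, i.e. 0.1 degree cells over the whole globe
    rasterGrid = raster(ncols = 3600, nrows = 1800,
                        xmn = -180, xmx = 180,
                        ymn = -90, ymx = 90)
    baseRaster = terra::rast(rasterGrid)
    # Read one raw tiff and resample it onto the template grid
    rawValues = terra::rast(tiffFile)
    consistentValues = terra::resample(rawValues, baseRaster, method = "bilinear")
    # One row per grid cell (NA cells kept), with a cell id for joining layers
    valueDataFrame = as.data.frame(consistentValues, xy = TRUE, na.rm = FALSE) %>%
      mutate(id = 1:terra::ncell(consistentValues))

Regress the per-area power production of existing solar farms on a laughably small number of factors (photovoltaic potential, land use, terrain slope).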
Using “spatial” “random forest”.
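A rough sketch of what such a regression could look like, not the team's actual model. It assumes the `ranger` package, uses made-up synthetic data in place of the real solar farm table, and treats "spatial" as nothing fancier than feeding the cell coordinates in as extra predictors. All column names here (`pvPotential`, `landUse`, `slope`, `powerPerArea`) are hypothetical.

```r
library(ranger)

set.seed(1)
n = 500
# Synthetic stand-in for the solar farm training table (all values made up)
farms = data.frame(
  x = runif(n, -180, 180),       # longitude of the grid cell
  y = runif(n, -60, 60),         # latitude of the grid cell
  pvPotential = runif(n, 2, 6),  # photovoltaic potential
  landUse = runif(n, 0, 1),      # land use / biomass score
  slope = runif(n, 0, 30)        # terrain slope
)
# Fake response so the example has something to learn
farms$powerPerArea = 10 * farms$pvPotential - 0.5 * farms$slope + rnorm(n)

# Assumption: coordinates enter as ordinary predictors, a crude way to let
# the forest pick up spatial structure
fit = ranger(powerPerArea ~ pvPotential + landUse + slope + x + y,
             data = farms, num.trees = 200)

# Predict per-area power for cells the model should score
newCells = farms[1:5, c("x", "y", "pvPotential", "landUse", "slope")]
predicted = predict(fit, data = newCells)$predictions
```

In a real run, `newCells` would be every cell of the common grid rather than five training rows.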
Demand was modelled using a proxy quantity constructed from night light intensity and population density.
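The slides don't say how the proxy was built, so the following is only one plausible construction, in base R: rescale both layers to [0, 1] and score a cell highly when many people live there but little light is emitted. The formula and the helper names `rescale01` and `demandProxy` are assumptions for illustration, not the team's method.

```r
# Rescale a numeric vector to [0, 1], ignoring NAs
rescale01 = function(v) {
  rng = range(v, na.rm = TRUE)
  (v - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical proxy: high where population is dense but night light is dim.
# log1p tames the heavy right tails of both quantities.
demandProxy = function(popDensity, nightLight) {
  rescale01(log1p(popDensity)) * (1 - rescale01(log1p(nightLight)))
}

# Toy cells: empty, rural, dense-and-dark, dense-and-bright (made-up values)
cells = data.frame(
  id = 1:4,
  popDensity = c(0, 50, 5000, 5000),
  nightLight = c(0, 1, 2, 60)
)
cells$demand = demandProxy(cells$popDensity, cells$nightLight)
```

Because every layer shares the same grid and cell `id`, a score like this can be joined to the suitability predictions by `id` for the overlay.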
So none of us had much experience with spatial data.
UN provided “data sources” - but it was just a shotgun list of other lists
Most of the day was spent collecting and sourcing data.
Initial focus was on Africa, but we couldn’t find nice shape files or very local data for the region.
Limitations:
- We wanted spatial data at a finer-than-country resolution with global coverage.

End of day:
- First model
- Models were trained overnight
R Shiny app was being built
Model was being iterated on Day 3
- Submitted at 6.30pm
- People watched until 9pm
- Everywhere good for food was closed or in the process of closing
Impress people with fancy graphics.
Communicating the trash you have assembled is (more) important (than the quality of trash you collect)
R and Python can work together
Spatial data is a pain in the ass to work with
QUT Centre for Data Science
ADSN
George
Farhan
Sundance
Jamie
Tim
Michael